Information processing apparatus and control method thereof

ABSTRACT

An information processing apparatus comprises: a registration unit adapted to register information required to determine at least one specific pattern in an image; an input unit adapted to input image data; a first generation unit adapted to extract a predetermined feature distribution from the input image data, and to generate a first feature distribution map indicating the feature distribution; a second generation unit adapted to generate a second feature distribution map by applying a conversion required to relax localities of the first feature distribution map to the first feature distribution map; and a determination unit adapted to determine, using sampling data on the second feature distribution map and the registered information, which of the specific patterns the image data matches.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an information processing apparatus which executes high-speed pattern identification of information, and a control method thereof.

2. Description of the Related Art

As a high-speed pattern identification method, various methods have been proposed so far. For example, in association with a target object discrimination method, Japanese Patent Laid-Open No. 2009-26326 has proposed a method of discriminating whether or not an input grayscale image is a target object based on a feature amount derived from the difference between pixel luminance values at two positions. This method can attain high-speed pattern identification since it uses a high-speed weak classifier that operates on sampling data in the input grayscale data, that is, pixel luminance values at two positions. However, with this method, the difference between the pixel luminance values at the two positions used as the feature amount is robust against bias variations of luminance values, but it is not robust against, for example, contrast variations caused by shading.

In association with pattern identification which is robust against such contrast variations, for example, Japanese Patent Laid-Open No. 2009-43184 has proposed a method which attains pattern identification using an edge image generated by applying edge extraction to input grayscale image data. With this method, features such as edges are extracted in advance, and pattern identification processing is applied to that data, thus allowing pattern identification which is robust against various variations. However, feature amounts such as edges generally tend to form a coarse distribution. Therefore, since the various pieces of combination information of sampling data exhibit no significant differences, a sufficiently high pattern identification performance cannot be obtained.

As described above, a pattern identification method which is robust against, for example, contrast variations caused by shading in image data, and which allows high-speed processing, is demanded.

The present invention provides, in consideration of the above situation, an information processing apparatus which identifies a specific pattern at high speed using sampling data in image data, and attains high-speed identification which is robust against various variations.

SUMMARY OF THE INVENTION

According to one aspect of the present invention, there is provided an information processing apparatus comprising: a registration unit adapted to register information required to determine at least one specific pattern in an image; an input unit adapted to input image data; a first generation unit adapted to extract a predetermined feature distribution from the input image data, and to generate a first feature distribution map indicating the feature distribution; a second generation unit adapted to generate a second feature distribution map by applying a conversion required to relax localities of the first feature distribution map to the first feature distribution map; and a determination unit adapted to determine, using sampling data on the second feature distribution map and the registered information, which of the specific patterns the image data matches.

Further features of the present invention will be apparent from the following description of exemplary embodiments with reference to the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing an arrangement associated with processing of a pattern identification method according to the first embodiment;

FIG. 2 is a flowchart showing the processing of the pattern identification method according to the first embodiment;

FIG. 3 shows examples of an input image and processing results according to the first embodiment;

FIGS. 4A and 4B illustrate an edge extraction image and an edge distribution diffused image according to the first embodiment;

FIG. 5 is a block diagram showing an arrangement associated with processing of a pattern identification method according to the second embodiment;

FIG. 6 is a flowchart showing the processing of the pattern identification method according to the second embodiment;

FIG. 7 shows examples of an input image and processing results according to the second embodiment;

FIG. 8 illustrates a structure of a decision tree according to the second and third embodiments;

FIG. 9 is a block diagram showing an arrangement associated with processing of a pattern identification method according to the third embodiment;

FIG. 10 is a flowchart showing the processing of the pattern identification method according to the third embodiment;

FIG. 11 shows examples of an input image and processing results according to the third embodiment;

FIG. 12 shows a practical example of a distance map according to the second embodiment;

FIGS. 13A and 13B illustrate an edge extraction image and a distance map according to the second embodiment;

FIG. 14 is a flowchart showing feature distribution map generation processing according to the third embodiment; and

FIG. 15 is a flowchart showing feature distribution diffused map generation processing according to the third embodiment.

DESCRIPTION OF THE EMBODIMENTS

An exemplary embodiment(s) of the present invention will now be described in detail with reference to the drawings. It should be noted that the relative arrangement of the components, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise.

First Embodiment

The first embodiment will explain a pattern identification method which uses two-dimensional grayscale image data of 20×20 pixels as input data, and identifies whether or not the image data is an image of a specific category. The specific category includes, for example, a human face category and a category of a part having a certain defect. This embodiment will explain a case in which a specific pattern in which a portion near the center of a human face is located nearly at the center of an input image (to be referred to as a “face pattern” hereinafter) is used as the specific category.

FIG. 1 shows the functional arrangement of an information processing apparatus which identifies a specific pattern according to the first embodiment. FIG. 2 shows the processing sequence of a specific pattern identification method according to the first embodiment. The functions shown in FIG. 1 may be implemented when, for example, a computer executes programs stored in a memory, in addition to hardware implementation. An example of the information processing apparatus which attains pattern identification will be described below with reference to FIGS. 1 and 2.

Referring to FIG. 1, a registration pattern information input unit 100 inputs information associated with a specific category, that is, a face pattern. A registration pattern information database 101 records and holds the information associated with the face pattern. In this embodiment, each weak classifier discriminates whether or not a difference value between pixel luminance values at two positions, used as a feature amount, is equal to or larger than a predetermined threshold, and final pattern identification is implemented by integrating the discrimination results of the weak classifiers. Thus, the registration pattern information input unit 100 inputs a plurality of sets of information as those for the weak classifiers, and records and holds them in the registration pattern information database 101. Each set of information includes:

(1) pieces of position information of the two points between which a difference is to be calculated;

(2) a threshold for the difference; and

(3) a score associated with the discrimination result of the weak classifier.

The registration pattern information input unit 100 also inputs an accumulated score threshold used to finally determine whether or not an input pattern is a pattern of the specific category, and records and holds it in the registration pattern information database 101. The processing in these units corresponds to “input registration pattern information” in step S200 in FIG. 2.
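
By way of illustration only, the registered information described above could be held in structures like the following Python sketch; the names WeakClassifier and RegistrationInfo, and their fields, are assumptions introduced for this description, not part of the embodiment:

```python
from dataclasses import dataclass

@dataclass
class WeakClassifier:
    p1: tuple  # (1) position (row, col) of the first point
    p2: tuple  # (1) position (row, col) of the second point
    threshold: float  # (2) threshold for the two-point difference
    score: float  # (3) score for the discrimination result

@dataclass
class RegistrationInfo:
    classifiers: list  # the plurality of weak classifiers
    accumulated_score_threshold: float = 0.0  # final decision threshold
```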

As the plurality of pieces of information for the weak classifiers, pieces of information obtained by ensemble learning using a large number of face patterns and patterns which are not face patterns (to be referred to as “non-face patterns” hereinafter) are used. However, in the ensemble learning, the two-dimensional grayscale image data is not used as is; instead, data obtained by applying edge extraction processing and smoothing processing (to be described later) to the two-dimensional grayscale image is used. In this embodiment, the AdaBoost method is used as the ensemble learning method. However, the present invention is not limited to this specific method. For example, other methods such as Real AdaBoost may be used.

The accumulated score threshold required to finally determine whether or not an input pattern is a pattern of the specific category is generally set to “0” in the case of AdaBoost. However, when, for example, an incorrect detection, that is, a detection of a non-face pattern as a face pattern, is to be suppressed, a relatively high accumulated score threshold may be set. In this way, the accumulated score threshold can be appropriately set according to the required performance.

An image input unit 10 inputs a grayscale image of 20×20 pixels (to be referred to as an “input image” hereinafter) used to identify whether or not that image matches the specific category, that is, the face pattern. More specifically, a face pattern 30 of a grayscale image shown in FIG. 3 is input. The processing in this unit corresponds to “input image” in step S20 in FIG. 2.

An edge extraction processing unit 11 generates an edge extraction image by applying edge extraction processing to the input image input by the image input unit 10 (first generation). As an example of this edge extraction processing, a Laplacian of Gaussian (LoG) filter is used. More specifically, noise removal using a Gaussian filter is applied to the input image, and a Laplacian filter is applied to the image which has undergone the noise removal. A filter obtained by compositing the Gaussian filter and the Laplacian filter may be used. In general edge extraction using the LoG filter, zero crossing points after application of the LoG filter are extracted as edges. However, in this embodiment, output values of the LoG filter which are equal to or larger than a predetermined threshold are simply extracted as edges.

More specifically, the following processing is executed for the image obtained after application of the LoG filter. That is, the pixel value of each pixel of that image is compared with a predetermined threshold. When the pixel value is smaller than the threshold, the pixel value at that position is changed to “0”. Note that this embodiment uses “0” as the predetermined threshold in this processing.
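
A minimal sketch of this edge extraction step, assuming scipy's gaussian_laplace as the composited LoG filter; the function name and the noise-removal σ are assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def extract_edges(image, sigma=0.5, threshold=0.0):
    """Step S21: apply a composited LoG filter and keep only responses
    at or above the threshold, setting all other pixel values to 0."""
    response = gaussian_laplace(image.astype(np.float64), sigma=sigma)
    return np.where(response >= threshold, response, 0.0)
```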

A face pattern 31 in FIG. 3 (first feature distribution map) is an application example of this edge extraction processing to the image 30 shown in FIG. 3. When the LoG filter is applied to a general image, roughly more than half of the output values tend to be equal to or smaller than “0”. For this reason, with the processing in this unit, an edge extraction image in which the pixel values are “0” at a large number of positions on the image (on the first feature distribution map) other than edge positions is obtained, as indicated by the face pattern 31 in FIG. 3. The processing in this unit corresponds to “edge extraction processing” in step S21 in FIG. 2. This embodiment executes the aforementioned edge extraction using the LoG filter, but the present invention is not limited to this. For example, edge extraction using a Sobel filter may be executed.

A smoothing processing unit 12 applies smoothing processing using a Gaussian filter to the edge extraction image generated by the edge extraction processing unit 11, thereby generating an edge distribution diffused image, that is, an edge image (second feature distribution map) in which the degrees of locality of the edge features are relaxed (second generation). In this smoothing processing, a Gaussian filter which has a larger smoothing width than the general Gaussian filter used for noise removal in the aforementioned LoG filter is applied. Normally, for the purpose of noise removal in an image of about 20×20 pixels, it is common practice to apply a Gaussian filter having, for example, a filter size of 3×3 pixels and σ=0.5. By contrast, this embodiment executes the smoothing processing using a Gaussian filter which has a filter size of 7×7 pixels and σ=1.5, that is, which has a width roughly three times that of the aforementioned Gaussian filter. The processing in this unit corresponds to the “smoothing processing” in step S22 in FIG. 2.
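
The diffusion step could then be sketched as follows, using scipy's gaussian_filter; the truncate value is an assumption chosen so that the kernel is about 7×7 pixels at σ=1.5:

```python
from scipy.ndimage import gaussian_filter

def diffuse_features(edge_image, sigma=1.5):
    """Step S22: smooth with a deliberately wide Gaussian (sigma=1.5),
    spatially diffusing the extracted edges to relax their localities.
    truncate=2.0 limits the kernel to about 7x7 pixels at this sigma."""
    return gaussian_filter(edge_image, sigma=sigma, truncate=2.0)
```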

A face pattern 32 (second feature distribution map) in FIG. 3 is an image obtained by applying the smoothing processing to the edge extraction image indicated by the face pattern 31. As shown in this example, this smoothing processing has a larger width than smoothing which primarily aims at noise removal. That is, in addition to noise removal, the smoothing processing in this unit primarily aims at spatially diffusing the previously extracted edges, that is, the pieces of feature information, to relax the localities of the features.

Locality is the property that a feature signal, that is, a signal which numerically expresses the presence or absence of a feature, exists only in the local range where that feature exists. A feature such as the previously extracted edge has such locality. However, as a result of the aforementioned smoothing processing, even a position where no feature signal exists may assume a signal value influenced by a feature which exists in the neighborhood of that position. Therefore, the locality of this feature information is relaxed. For this purpose, as the smoothing width, a width based on the characteristics of, for example, a standard feature distribution of the registration pattern, that is, a standard edge distribution of the face pattern, is set in place of the width required to remove noise.

More specifically, for example, when the standard feature distribution is coarse, as in the example of the face pattern 31 shown in FIG. 3, a large smoothing width is set so that the diffused features reach surrounding positions where no features exist in the original feature extraction result. For example, as shown in the example of the face pattern 32, a large smoothing width is set to diffuse the hems of the feature information up to regions near the cheeks of the face on the image (on the second feature distribution map).

However, when smoothing using an excessively large width is applied, even features which exist at spatially sufficiently distant positions may be mixed up by the smoothing processing, and the final identification precision may consequently deteriorate. For this reason, according to the density of the standard feature distribution, the smoothing width is reduced so as to prevent features which exist at spatially sufficiently distant positions from being composited into a single feature. Such a smoothing width can be empirically set according to the standard feature distribution of the registration pattern and the required pattern identification precision.

More specifically, for example, when a two-point difference value is used as the information of each weak classifier, as in this embodiment, the smoothing width (the range which the hem of the diffused feature information can reach) can be searched for within a range which is larger than the minimum value of the two-point distances and smaller than the maximum value of those distances. Note that this embodiment sets the parameter (smoothing width) used in the conversion required to relax the degrees of locality of features as a fixed parameter in advance. However, the parameter used in the conversion required to relax the degrees of locality of features may be dynamically set depending on the data as a processing target of the pattern identification. For example, the parameter may be dynamically set according to an average inter-peak distance of the edge intensities extracted from the input image as a processing target of the pattern identification. Such a method of dynamically setting the parameter will be described in the third embodiment.

As described above, in the pattern identification method of this embodiment, an edge distribution diffused image in which the pieces of extracted feature information are spatially diffused is generated by the aforementioned smoothing processing, and the pattern identification is executed using the edge distribution diffused image. Note that this embodiment executes the smoothing processing using the Gaussian filter. However, the present invention is not limited to this, and any other method may be used as long as it spatially diffuses the pieces of extracted feature information to relax the localities of the features. As for smoothing processing using a spatial filter, it is desirable to execute smoothing processing using a filter in which the circumferential weights are set to be smaller than the central weight. This embodiment uses an isotropic smoothing filter. However, the present invention is not limited to such a specific filter. For example, when the standard feature distribution is coarse in the vertical direction and dense in the horizontal direction, a vertically elongated Gaussian filter may be used. That is, a smoothing filter having a shape based on the standard feature distribution may be used.

The aforementioned processes in the edge extraction processing unit 11 and the smoothing processing unit 12 are applied to each of the large number of face patterns and non-face patterns used in the ensemble learning, and the learning is performed using the data to which these processes have been applied. In this case, it is desirable that parameters such as the smoothing width of the processes applied to the large number of patterns used in the learning match those of the processes to be applied in the actual pattern identification stage.

Upon completion of the generation processing of the edge distribution diffused image in the smoothing processing unit 12 shown in FIG. 1, pattern identification processing is executed using this edge distribution diffused image. A two-point comparison unit 13 in FIG. 1 compares a difference value between the luminance values of pixels at two points in the edge distribution diffused image with a threshold, based on the information of each weak classifier held in the registration pattern information database 101. When the difference value between the luminance values of the pixels at the two points is equal to or larger than the threshold, the two-point comparison unit 13 sends the score corresponding to that weak classifier to a score addition/subtraction unit 14. The score addition/subtraction unit 14 adds that score to an accumulated score. Conversely, when the difference value of the two points is smaller than the threshold, the two-point comparison unit 13 sends a score obtained by inverting the sign of the corresponding score to the score addition/subtraction unit 14, which adds that score to the accumulated score. This processing corresponds to a subtraction of the original score since its sign is inverted.

The processing sequence in the two-point comparison unit 13 and the score addition/subtraction unit 14 will be described below with reference to FIG. 2. In step S230, the two-point comparison unit 13 initializes the number of hypotheses to “0”, and also the accumulated score to “0”. Next, in step S231, the two-point comparison unit 13 selects the information of one weak classifier in turn from those of the plurality of weak classifiers held in the registration pattern information database 101 (the selection order can be arbitrarily set, but redundant selection is avoided). Then, the two-point comparison unit 13 compares the difference value between the luminance values of the pixels at the positions of the predetermined two sampling points, which are set in advance, of the edge distribution diffused image, with the threshold based on the selected information of the weak classifier.

Subsequently, in step S24, based on the comparison result in step S231, the score addition/subtraction unit 14 adds the corresponding score to the accumulated score when the difference value is equal to or larger than the threshold, or subtracts the corresponding score from the accumulated score when the difference value is smaller than the threshold. Furthermore, in step S232, the score addition/subtraction unit 14 increments the number of hypotheses by “1”.

Then, the score addition/subtraction unit 14 determines in step S233 whether or not the number of hypotheses has reached the total number (N) of weak classifiers held in the registration pattern information database 101. If the number of hypotheses has reached the total number N, the process advances to step S25. If the number of hypotheses has not reached the total number N, the process returns to step S231 to select the information of a new weak classifier from the registration pattern information database 101, thus repeating the aforementioned processes.

That is, the processes in steps S231, S24, and S232 are repeated as many times as the total number of weak classifiers held in the registration pattern information database 101. Then, the process advances to the next pattern determination step S25. This concludes the description of the processing sequence in the two-point comparison unit 13 and the score addition/subtraction unit 14.

Finally, a pattern determination unit 15 determines whether or not the accumulated score added and subtracted by the score addition/subtraction unit 14 is equal to or larger than the accumulated score threshold held in the registration pattern information database 101. When the accumulated score is equal to or larger than the accumulated score threshold, the pattern determination unit 15 determines that the image input by the image input unit 10 matches a pattern of the specific category, that is, a face pattern in this embodiment. When the accumulated score is smaller than the threshold, the pattern determination unit 15 determines a non-face pattern. The processing in this unit corresponds to “pattern determination” in step S25 in FIG. 2. Upon completion of the processing in this step, the processing of the pattern identification method of this embodiment ends.
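
Taken together, steps S230 through S25 might be sketched as below, reusing the hypothetical WeakClassifier and RegistrationInfo structures introduced earlier; this is an illustrative reading, not the embodiment's implementation:

```python
def is_face(diffused, info):
    """Steps S230 to S25: accumulate weak-classifier scores over the
    edge distribution diffused image, then compare the accumulated
    score with the registered accumulated score threshold."""
    accumulated_score = 0.0  # step S230
    for wc in info.classifiers:  # one hypothesis per iteration (S231-S233)
        diff = diffused[wc.p1] - diffused[wc.p2]  # two-point difference
        if diff >= wc.threshold:
            accumulated_score += wc.score  # step S24: addition
        else:
            accumulated_score -= wc.score  # step S24: sign-inverted score
    return accumulated_score >= info.accumulated_score_threshold  # step S25
```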

As described above, in the pattern identification method of this embodiment, the features which exist in the input data are extracted, and data is generated by relaxing the localities of the pieces of feature information. To this data, identification processing of a pattern is applied using classifiers which implement the identification using sampling data in the data. These classifiers have been learned using a large number of data that have undergone similar feature extraction processing and feature information diffusion processing. By executing the identification processing of sampling data using these classifiers, high-speed pattern identification which is robust against various data variations can be implemented.

In this case, three effects obtained by the processing for diffusing the pieces of feature information to relax the localities of the features, which is the most characteristic feature of the pattern identification method of this embodiment, will be described below. The first effect is a noise removal (suppression, in practice) effect, which is the same as the effect of general smoothing processing. Such noise removal can generally enhance the identification performance in pattern identification. This effect is a general effect obtained upon application of smoothing processing. The second effect is an effect of improving the robustness against variations of, for example, the extracted feature positions, which is the same as the effect of pooling processing in a feature combining layer. As a result of the improved robustness against variations of, for example, the extracted feature positions, improvement of the generalization performance of the pattern identification can be expected. This effect is also a general effect of applying the smoothing processing.

The third and final effect is an effect characteristic of the pattern identification method of this embodiment, and it will be described below with reference to FIGS. 4A and 4B. The abscissa in FIGS. 4A and 4B plots the position, and the ordinate plots the degree of feature distribution. As described above, the edges extracted as features in this embodiment are effective to improve the robustness against various variations, but such features generally tend to have a coarse distribution, that is, high localities. FIG. 4A illustrates this tendency. This embodiment has exemplified the processing for two-dimensional data; however, FIG. 4A shows one-dimensional data for the sake of simplicity. As described above, a feature signal such as an edge, which has a coarse distribution tendency, locally exists at a position 410, as shown in FIG. 4A, and does not appear at the other positions, that is, positions where no feature signal exists.

In this case, upon execution of the pattern identification method which uses sampling data for the purpose of high-speed execution of the pattern identification, that is, the method using a difference value between the luminance values of pixels at two points, the following phenomenon occurs. For example, a two-point difference value such as that indicated by positions 411 in FIG. 4A can carry significant information indicating a high possibility of the existence of a feature signal at the position of the right point of the positions 411.

However, the two-point difference values indicated by positions 412 and 413 in FIG. 4A can carry information indicating a possibility of the non-existence of a feature signal at the positions of the respective points, but they do not carry any information which can distinguish them from each other. In addition, in the case of features with a coarse distribution, that is, a situation in which feature signals locally exist at limited positions, a large number of two-point combinations carry no information difference, like the difference values at the positions 412 and 413. For this reason, the total number of two-point combinations from which information useful for identification can be obtained becomes relatively small, and as a result, a sufficiently high pattern identification performance cannot be obtained.

By contrast, as shown in FIG. 4B, when the feature signals are diffused by the smoothing processing, as at a position 420, to relax the localities of the features, the two-point difference values at positions 422 and 423, which correspond to the positions 412 and 413, have a difference. As shown in this example, when the localities of the features are relaxed, a large number of two-point combinations carry various variations of information compared to the case in which the localities of the features are not relaxed. For this reason, the total number of two-point combinations from which information useful for identification can be obtained becomes relatively large, and as a result, the pattern identification performance is more likely to be improved.

More specifically, the two-point difference value at the positions 423 assumes “0” in the same manner as in the case in which the feature signal is not diffused. However, while the two-point difference value at the positions 412 assumes “0”, as a result of the diffusion of the feature signals the two-point difference value at the corresponding positions 422 assumes a nonzero value according to the distance from the pixel position of the peak 420. In this way, the number of other two-point combinations which assume a difference value of 0, like the positions 423, is reduced, and as a result, the information indicating that the two-point difference value at the positions 423 is “0” becomes unique information, thus increasing the information volume.

From a qualitative viewpoint, such information can be information indicating a high possibility of the non-existence of a feature in the neighborhood of the two points at the positions 423. Also, for example, the two-point difference value at the positions 422 can be information indicating that a feature exists at the position of the right point of the two points, or that a feature exists in the direction of the right point. In this manner, the two-point difference value at the positions 422 additionally carries information indicating the possibility of the existence of a feature on the right side, compared to the two-point difference value at the positions 412.

In this embodiment, the pattern determination processing is executed in the pattern determination step S25 in FIG. 2 based on a score obtained by adding/subtracting, in the score addition/subtraction step S24, the plurality of results obtained in the two-point comparison step S231. When the pattern identification is attained by integrating a plurality of pieces of information, as in the AdaBoost method used in this embodiment, a high performance tends to be attained as the information volume pertaining to each piece of individual information increases. For this reason, since the information volumes of the two-point difference values at the positions 422 and 423 increase, a higher pattern identification performance is more likely to be attained.

In this way, since the pieces of feature information are diffused to relax the localities of the features, the pieces of combination information of the sampling data can carry various variations of information, which can be expected to improve the pattern identification performance. This effect is the characteristic effect obtained by the processing for diffusing the pieces of feature information to relax the localities of the features in the pattern identification method of this embodiment.

Note that this embodiment has explained the method using two-point difference values as the pattern identification method using sampling data. However, the present invention is not limited to this specific method. For example, a method of comparing one arbitrary point with a threshold, or a method of increasing the number of sampling data so as to use the difference between the sum of certain two points and that of other two points, may be used. In this way, the number of sampling data used in the pattern identification can be arbitrarily set. This embodiment has exemplified the method of identifying whether or not an input pattern matches that of the specific category. However, by applying the aforementioned processing to an input image in a raster-scan manner, a pattern of the specific category can be detected from the input image.

Second Embodiment

The second embodiment will explain an example of a pattern identification method which inputs an image obtained by capturing a specific object, and identifies the direction from which the image of the specific object was captured, as another embodiment of the pattern identification method described in the first embodiment.

FIG. 5 is a block diagram showing the arrangement of the processing of a pattern identification method according to the second embodiment. FIG. 6 shows the processing sequence of the pattern identification method according to the second embodiment. A registration pattern information input unit 500 in FIG. 5 inputs information associated with patterns which are obtained by capturing images of a specific object from various directions. A registration pattern information database 501 records and holds the information associated with the captured patterns. In this embodiment, unlike in the first embodiment, a plurality of decision trees which branch based on the comparison of two-point values are used, and the final pattern identification is attained by integrating the results of these trees. Hence, the registration pattern information input unit 500 executes processing for inputting the information of the plurality of decision trees. In this embodiment, a binary tree is used as the decision tree, and the information of one decision tree includes a plurality of pieces of branch node information and a plurality of pieces of leaf information.

FIG. 8 shows the structure of the decision tree used in this embodiment. The plurality of pieces of branch node information and the plurality of pieces of leaf information included in the decision tree of this embodiment will be described below with reference to FIG. 8. In FIG. 8, branch nodes 800, 801, 802, and 80n are indicated by circles, and each branch node has one set of branch node information. Also, leaves 810 and 81m are indicated by squares, and each leaf has one set of leaf information.

The branch node information includes the position of a first point and that of a second point whose luminance values are to be compared, and information of the branch destinations. In the actual processing, the magnitude relationship between the luminance values of the pixels at the positions of the first and second points is calculated based on the branch node information, and which branch destination the control advances to is decided according to that result.

In this embodiment, when the luminance value at the position of the first point is larger than that at the position of the second point, the control advances to the left branch destination in FIG. 8; otherwise, the control advances to the right branch destination. For example, in the case of the branch node 800, the information of the branch destinations indicates that the left branch destination is the branch node 801, and the right branch destination is the branch node 802.

Hence, in the actual processing, the values of the two points are compared based on the information of the branch node 800, and when the value at the position of the first point is larger, the control advances in the direction of the branch node 801; otherwise, the control advances in the direction of the branch node 802. In the following description, such a process will be referred to as a search of the decision tree. In this connection, the branch node 800 is called the root node, and the search of the decision tree is always started from this root node.

The leaf information is the information of a result of the decision tree, that is, in this embodiment, an identification result indicating the direction from which the image of the specific object was captured. In this embodiment, the pieces of information indicating the directions from which the images of the specific object were captured, which are originally continuous pieces of information, are quantized and expressed by 162 indices 1 to 162. These 162 indices correspond to the respective vertex positions of a so-called geodesic dome having 162 vertices. That is, each vertex position is regarded as a position from which an image of the specific object was captured, under the assumption that the specific object is located at the central position of this geodesic dome. For this reason, the leaf information is simply one of the numerical values 1 to 162. In this embodiment, one leaf has only one result. In this embodiment, each end of the decision tree is always a leaf. In the actual processing, the decision tree is searched until any one of the leaves is reached, and the index value held as that leaf information is used as the result of that decision tree.
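
For illustration, the branch node information, the leaf information, and the search of the decision tree might be sketched as follows; the Node class and its fields are hypothetical names introduced here:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    leaf_index: Optional[int] = None  # leaf information: one of 1..162
    p1: Optional[tuple] = None  # branch node information: first point
    p2: Optional[tuple] = None  # branch node information: second point
    left: Optional["Node"] = None  # destination when first point is larger
    right: Optional["Node"] = None  # destination otherwise

def search_tree(data, root):
    """Search the decision tree from the root node until a leaf is
    reached, branching left when the value at the first point is larger
    than that at the second point, and right otherwise."""
    node = root
    while node.leaf_index is None:
        node = node.left if data[node.p1] > data[node.p2] else node.right
    return node.leaf_index
```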

The information of such a decision tree is generated by learning using a large number of image data obtained by capturing images of the specific object from various directions. Since this embodiment uses a recognition method which integrates the pieces of information of various decision trees, a plurality of different decision trees have to be generated. Hence, this embodiment generates a plurality of different decision trees using a bagging method. That is, a plurality of subsets of image data randomly sampled from the large number of image data used in learning are generated, and the decision trees are generated using them. The data used in the generation of the decision trees are those which have undergone the edge extraction processing and the distance map generation processing (to be described later), as in the first embodiment.

In this embodiment, the plurality of different decision trees are generated using the bagging method. However, the present invention is not limited to this, and any other method may be used as long as it can generate a variety of decision trees. For example, a method of generating decision trees using the AdaBoost method as in the first embodiment, and a method of generating a variety of decision trees by randomly selecting data in the respective branch nodes, as in the method disclosed in Japanese Patent Laid-Open No. 2005-03676, may be used.

The registration pattern information input unit 500 executes processing for inputting the information of as many of the aforementioned decision trees as the number (=N) of decision trees, and recording and holding them in the registration pattern information database 501. The processing in these units corresponds to “input registration pattern information” in step S600 in FIG. 6.

An image input unit 50 inputs an image obtained by capturing the specific object (to be referred to as an “input image” hereinafter) as the target used to determine the direction from which that image was captured. FIG. 7 shows examples of the input image and processing results. In this embodiment, a grayscale image 70 in FIG. 7 is input. The processing in this unit corresponds to “input image” in step S60 in FIG. 6.

An edge extraction processing unit 51 in FIG. 5 is a processing unit which generates an edge extraction image from the input image, as does the edge extraction processing unit 11 of the first embodiment. In this embodiment, Canny edge detection (extraction) processing is executed as the edge extraction processing. The processing in this unit corresponds to “edge extraction processing” in step S61 in FIG. 6. An image 71 is an application example of this Canny edge extraction processing to the image 70 in FIG. 7. As shown in the image 71, with the Canny edge detection processing, a binary image in which positions where an edge connected to a strong edge exists are “1”, and the other positions are “0”, is obtained as the edge extraction image. As described above, the edge extraction processing of this embodiment generates an edge extraction image having a distribution in which the extracted features have high localities, that is, the feature extraction results of edges are coarse, as in the edge extraction processing of the first embodiment.
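
A sketch of this step under the assumption that OpenCV's cv2.Canny supplies the Canny edge detection; the hysteresis thresholds are assumed values, as the embodiment does not specify them:

```python
import cv2
import numpy as np

def extract_canny_edges(image):
    """Step S61: binary edge extraction image in which positions of
    edges connected to a strong edge are 1 and other positions are 0."""
    edges = cv2.Canny(image, 100, 200)  # assumed hysteresis thresholds
    return (edges > 0).astype(np.uint8)
```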

Next, a distance map generation processing unit 52 generates a distance map based on the edge extraction image generated by the edge extraction processing unit 51. The processing in this unit is that required to generate data in which the localities of the extracted features are relaxed, as in the smoothing processing using the Gaussian filter in the smoothing processing unit 12 in the first embodiment. This processing corresponds to “distance map generation processing” in step S62 in FIG. 6. The distance map is a map having the distances to the nearest-neighbor features as the values of the respective positions. More specifically, in this map, the feature of the edge which exists at the nearest position is searched for at each position on the image, and the distance to the found nearest-neighbor feature is set as the value of that position.

FIG. 12 shows a practical example of the distance map. For example, at a position 120 shown in FIG. 12, an edge located at a position 121 is the nearest feature. If the distance from the position 120 to the position 121 is d1, the distance map value of the position corresponding to the position 120 is d1. Note that such a normal distance map may be used. However, in this embodiment, an upper limit value is set in advance for the values of the distance map, and the map is generated in such a manner that when the distance to the nearest-neighbor feature is equal to or larger than this upper limit value, the upper limit value is set as the value at that position.

For example, in the case of a position 122 shown in FIG. 12, an edge located at a position 123 is the nearest feature. When the distance from the position 122 to the position 123 is d2, and d2 is larger than a predetermined upper limit value DT, the constant value DT is set as the value of the distance map at the position corresponding to the position 122. With the processing using such an upper limit value, the value of a position which is separated from a feature existence position by a distance equal to or larger than the set upper limit value can be suppressed to a constant value.
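 
Assuming scipy's Euclidean distance transform, the distance map generation with the upper limit value DT might be sketched as:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def make_distance_map(edge_image, upper_limit):
    """Step S62: at each position, the Euclidean distance to the
    nearest edge, clipped at the upper limit value DT."""
    # distance_transform_edt measures the distance to the nearest zero,
    # so pass a mask that is zero exactly at the edge positions
    distances = distance_transform_edt(edge_image == 0)
    return np.minimum(distances, upper_limit)
```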

FIGS. 13A and 13B illustrate an edge extraction image and the corresponding distance map. For the sake of simplicity, FIGS. 13A and 13B show one-dimensional data. As shown in FIG. 13A, the edge extraction image has a rod-like characteristic 1310, where an edge is extracted. As shown in FIG. 13B, in the distance map generated in this case, the value at the position of a characteristic 1320, corresponding to the characteristic 1310 in FIG. 13A at which the edge is extracted, is “0”. The characteristic 1320 assumes a value proportional to the distance with increasing distance from that “0” position, and then assumes the constant value DT from the position at which the distance equals DT. Then, as shown in FIGS. 13A and 13B, the pieces of information of the two-point combinations at positions 1322 and 1323, corresponding to positions 1312 and 1313, that is, in this embodiment, the magnitude relationships between the values of the pixels at the two-point positions, have a difference.

As a qualitative information difference, the two-point combination at the positions 1322 carries information indicating that a feature is more likely to exist in the direction of the right point of the two points. Also, the two-point combination at the positions 1323 carries information indicating that no feature is likely to exist in the vicinity of the two points at the positions 1323, thus generating a difference.

If the aforementioned upper limit value were not set for the information of the magnitude relationship between the values at two points used in this embodiment, the pieces of information of the two-point combinations at the positions 1322 and 1323 would have no difference unless there were an influence of a feature which exists at a position other than that corresponding to the position 1310. However, in practice, since there are influences of features which exist at various positions, even if no upper limit value is set, the pieces of information often have differences. That is, when no upper limit value is set, the information indicating that no feature is likely to exist in the vicinity of the two points at the positions 1323 cannot be obtained from the two-point combination at the positions 1323.

In this way, as a result of the processing in which the upper limit value is set in advance, information indicating that no feature is likely to exist in the vicinity of given positions can be obtained. This upper limit value can be set based on, for example, the characteristics of the standard feature distribution of the registration pattern, as in the first embodiment. More specifically, for example, the upper limit value can be set to ½ of the distance between the features of two points which are considered to be at sufficiently distant positions. Alternatively, this upper limit value may be set based on the distance between the two points to be compared in the decision tree. In this case, the upper limit value may be set to an intermediate value between the minimum and maximum values of the distances between the two points to be compared in the decision trees.

Note that, as in the first embodiment, the parameter used in the conversion required to relax the degrees of locality of features, that is, the upper limit value of the distance map, is set in advance as a fixed parameter. Alternatively, this parameter may be dynamically set depending on the data as a processing target of the pattern identification. For example, the parameter can be dynamically set according to the total number of points extracted as edges from the input image as a processing target of the pattern identification.

As this upper limit value, the same value need not always be used at all positions. For example, the upper limit values may be set according to the local distribution characteristics at the respective positions in the edge extraction image generated by the edge extraction processing unit 51. More specifically, the density of the edge extraction image within a region in the vicinity of an arbitrary position, for example, a region having a radius of r pixels, is calculated. When the density is high, a relatively small upper limit value may be set at that position; otherwise, a relatively large upper limit value may be set. In this way, individual upper limit values can be set at the respective positions.

An image 72 in FIG. 7 is an example of a distance map generated based on the edge extraction image 71. As shown in this example, in the distance map generated in this case, a position where an edge is extracted in the edge extraction image, for example, a white (value=1) position in the image 71, is expressed by black (value=0), and the values become higher with increasing distance from the edge position. At a position which is separated from the nearest edge by a distance equal to or larger than the upper limit value, a constant value (the upper limit value) is set.

In this manner, the distance map generated by this processing is data which has various variations of values even at positions where no feature exists, and in which the localities of the features are relaxed compared to the source edge extraction image, as in the edge distribution diffused image of the first embodiment. For this reason, the same characteristic effects as those described in the first embodiment can be obtained. Note that this embodiment uses the normal Euclidean distance as the distance measure upon generation of the distance map. However, the present invention is not limited to this. For example, a distance measure such as the Manhattan distance may be used. Also, this embodiment generates the distance map based on the distances to the nearest-neighbor features. Alternatively, for example, the distance map may be generated based on the average value of the distances to a plurality of features which exist at neighboring positions.

As in the first embodiment, the processes in the edge extraction processing unit 51 and the distance map generation processing unit 52 are respectively applied to the large number of images used in the generation of the decision trees, and the decision trees are generated using the data after application of these processes. In this case, as in the first embodiment, it is desirable that the parameters of these processes applied to the large number of images used in the generation of the decision trees, for example, the upper limit value setting, match those of the processes to be applied in the actual pattern identification stage.

Upon completion of the generation processing of the distance map in the distance map generation processing unit 52 in FIG. 5, pattern identification processing is executed using this distance map. A decision tree search processing unit 53 in FIG. 5 searches the plurality of decision trees held in the registration pattern information database 501 to calculate one identification result per decision tree, that is, one of the indices 1 to 162, and sends the identification result to a score addition unit 54. The score addition unit 54 in FIG. 5 adds “1” to the accumulated score of the index corresponding to the result sent from the decision tree search processing unit 53.

The processing sequence in the decision tree search processing unit 53 and the score addition unit 54 will be described below with reference to FIG. 6. In step S630, the decision tree search processing unit 53 initializes the number of decision trees to “0”, and also all elements of the accumulated score to “0”. In this case, the accumulated score in this embodiment is a table having 162 elements, and the respective elements correspond to the aforementioned indices 1 to 162 as the identification results.

In step S631, the decision tree search processing unit 53 selects the information of one decision tree in turn from those of the plurality of decision trees held in the registration pattern information database 501 (the selection order can be arbitrarily set, but redundant selection is avoided). Then, the decision tree search processing unit 53 compares the values of the pixels at two points on the distance map based on the information of the selected decision tree to search the decision tree until one leaf is reached, thus obtaining one index as the identification result.

In step S64, the score addition unit 54 adds “1” to the element of the accumulated score corresponding to that index, based on the result of step S631. For example, if the index obtained as the identification result is k, the score addition unit 54 executes processing for adding “1” to the k-th element of the accumulated score table. After that, the score addition unit 54 increments the number of decision trees by “1” in step S632.

Then, the score addition unit 54 determines in step S633 whether or not the number of decision trees has reached the total number (N) of decision trees held in the registration pattern information database 501. If the number of decision trees has reached the total number N, the process advances to the next pattern determination step S65. If the number of decision trees has not reached the total number N, the process returns to step S631 to select the information of a new decision tree from the registration pattern information database 501, thus repeating the aforementioned processes. This concludes the description of the processing sequence in the decision tree search processing unit 53 and the score addition unit 54.
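
Steps S630 through S65 might be sketched as follows, reusing the hypothetical search_tree function from the earlier sketch; again, this is an illustrative reading rather than the embodiment's implementation:

```python
import numpy as np

def identify_direction(distance_map, trees):
    """Steps S630 to S65: search every decision tree on the distance
    map, vote into a 162-element accumulated score table, and return
    the index (1..162) with the maximum score."""
    accumulated_score = np.zeros(162, dtype=int)  # step S630
    for tree in trees:  # steps S631 to S633
        k = search_tree(distance_map, tree)  # one index per tree
        accumulated_score[k - 1] += 1  # step S64
    return int(np.argmax(accumulated_score)) + 1  # step S65
```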

Finally, a pattern determination unit 55 in FIG. 5 extracts the element having the maximum score from the elements of the accumulated score added by the score addition unit 54, and obtains the index corresponding to that element. Then, the pattern determination unit 55 determines that the image input by the image input unit 50 in FIG. 5, that is, the captured image of the specific object, was captured from the direction corresponding to the obtained index. The processing in this unit corresponds to “pattern determination” in step S65 in FIG. 6. Upon completion of this processing, the processing of the pattern identification method of this embodiment ends.

As described above, in the pattern identification method of this embodiment, the features which exist in the input data are extracted, and a distance map in which the localities of the extracted features are relaxed is generated by the distance map generation processing. Then, a plurality of decision trees in which the branch processing is executed at the respective branch nodes using sampling data in this distance map are used, and their results are integrated, thus implementing the pattern identification processing. In this manner, high-speed pattern identification which is robust against various variations of the input data can be attained.

Note that this embodiment has explained the method using two-point comparison results as the pattern identification method using sampling data. As in the first embodiment, the present invention is not limited to this. For example, a method of executing the branch processing based on the comparison result between the value of a certain point and a predetermined threshold in each branch node of the decision tree, and a method of executing the branch processing based on a plurality of data, may be used. In this embodiment as well, the number of sampling data used in the pattern identification can be arbitrarily set.

This embodiment has exemplified the method which inputs an image obtained by capturing a specific object, and identifies the direction from which the image of that specific object was captured. However, the arrangement of this embodiment is not limited to such a specific use purpose. For example, this embodiment is applicable to a method of identifying whether or not input data matches a specific category (for example, whether or not an input image is a face pattern), as in the first embodiment. Also, this embodiment is applicable to a method of detecting a pattern of a specific category from an input image by raster-scan processing, as in the first embodiment.

Third Embodiment

The third embodiment will exemplify a pattern identification method which inputs an image obtained by capturing a specific object, and identifies the direction from which the image of that specific object was captured, as in the second embodiment.

FIG. 9 shows the functional arrangement of a pattern identification apparatus according to the third embodiment. FIG. 10 shows the processing sequence of a pattern identification method according to this embodiment. An example of the pattern identification apparatus will be described below with reference to FIGS. 9 and 10, but a description of the same parts as those in the second embodiment will not be repeated.

A registration pattern information input unit 900 is the same as the registration pattern information input unit 500 in the second embodiment. In this case, the registration pattern information input unit 900 inputs the pieces of information of a plurality of decision trees. A registration pattern information database 901 is also the same as the registration pattern information database 501, and records and holds the plurality of decision trees. The registration pattern information input unit 900 corresponds to “input registration pattern information” in step S1000 in FIG. 10.

The plurality of decision trees used in this embodiment are basically the same as those in the second embodiment, except for the pieces of information of the two points to be compared at a branch node. In this embodiment, a plurality of feature distribution diffused maps are generated from a single input image by the processes in a feature distribution map generation processing unit 911 and a smoothing processing unit 92 (to be described later). For this purpose, the pieces of information of the two points to be compared at a branch node of this embodiment include information indicating which of the feature distribution diffused maps, and which position on it, each point corresponds to. Since the decision trees are the same as those in the second embodiment except for this difference, a description thereof will not be repeated. Also, the data used in the generation of a decision tree are the same as those in the second embodiment, except that data after application of the corner detection processing, the feature distribution map generation processing, and the smoothing processing are used.

The processing in an image input unit 90 in FIG. 9 corresponds to “input image” in step S1001, and is the same as that in the image input unit 50 of the second embodiment. Hence, a description thereof will not be repeated. A corner detection processing unit 910 applies corner detection processing to the input image to calculate the positions of the corners which exist in the input image.

This embodiment uses Harris corner detection processing as the corner detection processing. The processing in this unit corresponds to “corner detection processing” in step S1010 in FIG. 10. An application example of this Harris corner detection processing to an image 110 in FIG. 11 is an image 111 in FIG. 11. In the image 111 in FIG. 11, the positions indicated by open circles are those detected as corners. Note that this embodiment detects corners using the Harris corner detection processing. However, the present invention is not limited to this. For example, another corner detection processing such as Moravec corner detection processing may be used.
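
A sketch of this step assuming OpenCV's goodFeaturesToTrack in Harris mode; the quality and distance parameters below are assumed values, not specified by the embodiment:

```python
import cv2
import numpy as np

def detect_corners(image, max_corners=100):
    """Step S1010: corner positions detected with the Harris measure."""
    pts = cv2.goodFeaturesToTrack(image, maxCorners=max_corners,
                                  qualityLevel=0.01, minDistance=3,
                                  useHarrisDetector=True, k=0.04)
    return pts.reshape(-1, 2).astype(int)  # (x, y) positions
```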

Next, the feature distribution map generation processing unit 911 calculates the luminance gradient directions at the respective corner positions detected by the corner detection processing unit 910, and generates a plurality of feature distribution maps based on the calculation results. In this case, each feature distribution map corresponds to one of the ranges obtained by dividing the gradient directions from 0° to 180° into a plurality of regions. In this embodiment, the gradient directions from 0° to 180° are divided into six 30° regions. Therefore, one feature distribution map exists for each of the ranges from 0° to less than 30°, from 30° to less than 60°, . . . , from 150° to less than 180°.

FIG. 14 shows the sequence of the feature distribution map generation processing in this unit. The generation processing of a plurality of feature distribution maps in the feature distribution map generation processing unit 911 in FIG. 9 will be described below with reference to FIG. 14. In step S140, the feature distribution map generation processing unit 911 initializes the values of all positions on the plurality of feature distribution maps to “0”.

In step S141, the feature distribution map generation processing unit 911 selects one corner in turn from the plurality of corners detected by the corner detection processing unit 910 (the selection order can be arbitrarily set, but redundant selection is avoided). Subsequently, in step S142, the feature distribution map generation processing unit 911 calculates the luminance gradient direction of the input image at the corner position selected in step S141.

The luminance gradient direction is calculated using a Sobel filter in this embodiment. More specifically, letting Gx be the horizontal Sobel filter output at the corner position of the input image, and Gy be the vertical Sobel filter output, a gradient direction θ is given by θ = atan(Gy/Gx), where atan is the inverse of the tangent function and has a codomain from 0° to 180°.

In step S143, the feature distribution map generation processing unit 911 selects the feature distribution map corresponding to the region, of those obtained by dividing the directions from 0° to 180° into the six regions, to which the luminance gradient direction calculated in step S142 belongs. In step S144, the feature distribution map generation processing unit 911 sets “1” as the value of the position corresponding to the corner position selected in step S141 in the feature distribution map selected in step S143. The feature distribution map generation processing unit 911 determines in step S145 whether or not all corners detected by the corner detection processing unit 910 have been selected in step S141. If all the corners have been selected, this feature distribution map generation processing ends. If all the corners have not been selected yet, the process returns to the corner selection step S141 to select a new corner, thus repeating the aforementioned processes.

That is, by applying the processes in steps S141 to S144 to all the corners detected by the corner detection processing unit 910, this feature distribution map generation processing, that is, the generation processing of the plurality of feature distribution maps in the feature distribution map generation processing unit 911, ends. The processing in this unit corresponds to “feature distribution map generation processing” in step S1011 in FIG. 10.
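
The loop of steps S141 to S145 can be sketched as follows. This is a minimal illustration assuming OpenCV Sobel outputs, with arctan2 folded into the 0° to 180° codomain as a stand-in for the atan(Gy/Gx) of this embodiment (it also handles Gx = 0).

```python
import cv2
import numpy as np

NUM_BINS = 6  # six 30-degree ranges covering 0 to 180 degrees

def generate_feature_maps(gray: np.ndarray, corners):
    """Steps S140-S145: one binary feature distribution map per direction range."""
    h, w = gray.shape
    maps = np.zeros((NUM_BINS, h, w), dtype=np.float32)  # S140: initialize to 0
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)      # horizontal output Gx
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)      # vertical output Gy
    for x, y in corners:                                 # S141: select each corner
        theta = np.degrees(np.arctan2(gy[y, x], gx[y, x])) % 180.0  # S142
        bin_index = min(int(theta // 30.0), NUM_BINS - 1)           # S143
        maps[bin_index, y, x] = 1.0                                 # S144
    return maps
```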

Each of the plurality of feature distribution maps generated by this processing has a very coarse distribution in which only the positions of those detected corners whose luminance gradient directions fall within the range corresponding to that feature distribution map are “1”, and all other positions are “0”. More specifically, for example, the number of positions detected as corners in the image 111 in FIG. 11 is 46, and the number of these corner positions whose luminance gradient directions fall in the range from 0° to less than 30° is 10. As a result, the feature distribution map corresponding to the range from 0° to less than 30° has a very coarse distribution in which only 10 points in the map are “1”, and all other positions are “0”. Therefore, these features also have high localities.

For this reason, as described above, when the pattern identification using sampling data is executed using these feature distribution maps intact, a sufficiently high identification performance cannot often be attained. Hence, in this embodiment as well, the smoothing processing unit 92 in FIG. 9 applies a conversion required to relax the localities of features to this plurality of feature distribution maps.

As described above, the smoothing processing unit 92 applies the conversion required to relax the localities of features to the plurality of feature distribution maps generated by the feature distribution map generation processing unit 911, thereby generating a plurality of feature distribution diffused maps. In this embodiment, the smoothing processing unit 92 applies two different conversions required to relax the localities of features in two stages.

FIG. 15 shows the processing sequence of the smoothing processing unit 92. In FIG. 15, the spatial filtering processing in step S155 corresponds to the first of the two conversion stages. The luminance gradient direction spatial filtering processing in step S156 corresponds to the second of the two conversion stages.

The two conversions required to relax the localities of features, which are executed by this smoothing processing unit 92, are the same as the conversion by means of the smoothing processing in the first embodiment. However, in the first conversion stage, the parameter used in the conversion is not fixed but is dynamically set depending on the data to be processed by the pattern identification, unlike the parameter setting of the first embodiment. The processing for dynamically setting the parameter used in the conversion depending on the processing target data corresponds to the repetitive processes in steps S151 to S153 in FIG. 15. In FIG. 10, this processing corresponds to the “conversion parameter decision processing” in step S1020.

The generation processing of a plurality of feature distribution diffused maps in this smoothing processing unit 92 will be described below with reference to FIG. 15. In step S150, the smoothing processing unit 92 initializes the values of all positions on the plurality of feature distribution diffused maps to “0”. In step S151, the smoothing processing unit 92 selects one corner in turn from the plurality of corners detected by the corner detection processing unit 910 (the selection order can be arbitrarily set, but redundant selection is avoided).

In step S152, the smoothing processing unit 92 calculates the number C of corners detected in a region of a predetermined radius centered on the corner selected in step S151, for example, a region whose radius is about 1/10 of the image width. Next, in step S153, the smoothing processing unit 92 sets the spatial filter used at the corner position selected in step S151 based on the number C of corners calculated in step S152 using:

$\exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right),\quad\sigma=\frac{\alpha}{\sqrt{C}}\qquad(1)$

where x and y indicate positions in the spatial filter when the center of the spatial filter is defined as the origin. Also, α is a constant associated with the width of the spatial filter. This α can be empirically set in advance to a value that prevents neighboring pieces of feature information from being mixed up by the spatial filtering processing in this step. In this embodiment, the value of α is set to ¾ of the radius. Using this spatial filter, when the number of detected corners is small, that is, when C is small, the width of the spatial filter to be applied is set to be large. Conversely, when the number of detected corners is large, the width of the spatial filter is set to be small.
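
A minimal sketch of the parameter decision in steps S151 to S153 follows, assuming that C counts the selected corner itself (so C ≥ 1); the radius and α follow the values stated above.

```python
import numpy as np

def decide_filter_widths(corners, image_width: int):
    """Per-corner Gaussian width sigma = alpha / sqrt(C), per equation (1)."""
    radius = image_width / 10.0     # neighborhood radius: ~1/10 of image width
    alpha = 0.75 * radius           # alpha set to 3/4 of the radius
    pts = np.asarray(corners, dtype=np.float32)
    sigmas = []
    for p in pts:                   # S151: select each corner in turn
        dists = np.hypot(pts[:, 0] - p[0], pts[:, 1] - p[1])
        c = int(np.sum(dists <= radius))    # S152: corners within the region
        sigmas.append(alpha / np.sqrt(c))   # S153: equation (1)
    return sigmas
```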

The smoothing processing unit 92 determines in step S154 whether or not all the corners detected by the corner detection processing unit 910 have been selected in step S151. If all the corners have been selected, the process advances to the spatial filtering processing in step S155. If all the corners have not been selected yet, the process returns to the corner selection step S151 to select a new corner, thus repeating the aforementioned processes. That is, the processes in steps S151 to S153 are applied to all the corners detected by the corner detection processing unit 910, and the process then advances to the next spatial filtering processing step S155.

With the processes executed so far, for all the corners detected by the corner detection processing unit 910, spatial filters according to the numbers of corners which exist within their neighboring regions (regions of a predetermined radius) are set. As described above, in this embodiment, the parameter of the conversion required to relax the localities of features, that is, the spatial filter used in the smoothing processing of this embodiment, is dynamically set depending on the data to be processed by the pattern identification.

In step S155, the smoothing processing unit 92 executes so-called projected field type spatial filtering processing using the feature distribution maps generated by the feature distribution map generation processing unit 911, thereby attaining the conversion required to relax the localities of features.

Normal receptive field type spatial filtering processing sets, as the value of each position after filtering, a value obtained by weighting the values of the corresponding positions of the filtering application target data by the values of the spatial filter and adding them. By contrast, the projected field type spatial filtering processing executes the following processing. That is, a value of the filtering application target data (the feature distribution map generated by the feature distribution map generation processing unit 911 in this case) is weighted by the values of the spatial filter used in the filtering processing. Then, the weighted spatial filter values are added in turn to the values of the corresponding positions after the filtering processing (the values of the positions of the feature distribution diffused map in this case).

Such projected field type spatial filtering processing need only be executed at positions where a signal exists (positions having a value of “1” on the feature distribution map) on the filtering application target data when that data is very coarse. Therefore, high-speed processing can be attained. More specifically, the aforementioned projected field type spatial filtering processing can be applied, using the spatial filters set in step S153, only at the positions in the feature distribution maps whose values are “1”, that is, the corner positions detected by the corner detection processing unit 910.
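
The scatter-style processing can be sketched as follows; the Gaussian truncation radius is an illustrative choice, and the kernel is the unnormalized form of equation (1).

```python
import numpy as np

def projected_field_filter(feature_map, corners, sigmas, truncate: float = 3.0):
    """Scatter a Gaussian kernel from every position whose map value is 1."""
    h, w = feature_map.shape
    out = np.zeros((h, w), dtype=np.float32)
    for (x, y), sigma in zip(corners, sigmas):
        if feature_map[y, x] != 1.0:
            continue  # this corner belongs to another direction range's map
        r = int(truncate * sigma)
        ys, xs = np.mgrid[max(y - r, 0):min(y + r + 1, h),
                          max(x - r, 0):min(x + r + 1, w)]
        # Add the weighted kernel values to the output positions (scatter-add).
        out[ys, xs] += np.exp(-((xs - x) ** 2 + (ys - y) ** 2)
                              / (2.0 * sigma ** 2))
    return out
```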

The spatial filter used in this projected field type spatial filtering is set according to equation (1) above, based on the number of corners which exist in its neighboring region, as described above. Using such a spatial filter, when the number of detected corners in the neighboring region is small, that is, when C is small, the width of the spatial filter to be applied is set to be large. Conversely, when the number of corners is large, the width of the spatial filter is set to be small. For this reason, in a region where corners exist densely, these pieces of feature information can be diffused by a small width so as not to be mixed up with other pieces of feature information. In a region where corners exist sparsely, these pieces of feature information can be diffused by a large width.

As described above, in the pattern identification method according to the third embodiment, the conversion required to relax the localities of features may be set for each individual position, for example, according to the feature distribution near that position. In this embodiment, the width of the spatial filter is changed according to the feature distribution near the position of interest. However, the present invention is not limited to this. For example, the shape of the spatial filter may be changed according to the principal axis direction of a distribution. Also, in this embodiment, the spatial filter is set according to the number of features in the vicinity of the position of interest, that is, the density of the feature distribution in a neighboring region. However, the present invention is not limited to this. For example, the spatial filter may be set based on the distance to the nearest-neighbor feature (excluding the feature itself), as sketched below.
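
As one possible reading of this nearest-neighbor alternative, the width could be made proportional to the distance to the closest other corner; the proportionality factor here is an illustrative assumption.

```python
import numpy as np

def sigma_from_nearest_neighbor(corners, scale: float = 0.5):
    """Alternative: sigma proportional to the distance to the nearest other corner."""
    pts = np.asarray(corners, dtype=np.float32)
    sigmas = []
    for i in range(len(pts)):
        dists = np.hypot(pts[:, 0] - pts[i, 0], pts[:, 1] - pts[i, 1])
        dists[i] = np.inf  # exclude the corner itself, as noted above
        sigmas.append(scale * float(dists.min()))
    return sigmas
```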

FIG. 11 shows examples of the plurality of feature distribution diffused maps. Feature distribution diffused maps 1121 to 1126 are examples of maps which have undergone the spatial filtering processing in step S155, that is, the feature locality relaxing processing in the first stage. The feature distribution diffused map 1121 is the result obtained by relaxing the localities of features in the position space, by the processing in step S155, for a feature distribution map (not shown) corresponding to the luminance gradient direction range from 0° to less than 30°. Likewise, the feature distribution diffused maps 1122 to 1126 are the processing results respectively corresponding to the ranges from 30° to less than 60° through 150° to less than 180°. As shown in FIG. 11, by the processing in step S155, pieces of feature information are diffused by a small width at densely distributed corner positions, and by a large width at sparsely distributed corner positions. Thus, pieces of feature information at nearby positions can be appropriately prevented from being mixed up, and the tails of the diffused feature information can reach positions where no feature exists, thus greatly relaxing the localities of features compared to the original data.

However, for example, the feature distribution diffused map 1122 in FIG. 11 includes many regions where no feature signal exists, and the localities in this map are not sufficiently relaxed. Hence, as the second conversion stage required to relax the localities of features, luminance gradient direction spatial filtering processing, which also diffuses pieces of feature information in the luminance gradient direction space, is executed in turn in step S156, thus also relaxing the localities of features in the luminance gradient direction space.

More specifically, by weighting the processing results in step S155 corresponding to the neighboring luminance gradient direction ranges by predetermined values, and adding the weighted results, pieces of feature information are diffused in the luminance gradient direction space. The practical processing results in this case are the feature distribution diffused maps 1131 to 1136 shown in FIG. 11.

The sequence of processing for generating the result of the feature distribution diffused map 1133, which corresponds to the luminance gradient direction range from 60° to less than 90°, among the processing results in step S156, will be described below. The result of the feature distribution diffused map 1123, corresponding to the same luminance gradient direction range, is added after being weighted by 1.0, that is, intact (corresponding to the bold downward arrow in FIG. 11). The results of the feature distribution diffused maps 1122 and 1124, corresponding to the neighboring luminance gradient direction ranges, are weighted by 0.6, and the products are added (corresponding to the oblique solid arrows). Finally, the results of the feature distribution diffused maps 1121 and 1125, corresponding to the luminance gradient direction ranges one range further away, are weighted by 0.2, and the products are added, thus obtaining the feature distribution diffused map 1133, in which pieces of feature information are also diffused in the luminance gradient direction space.

In this embodiment, since the luminance gradient directions 0° and 180° match, the luminance gradient direction ranges which neighbor the range from 0° to less than 30° are, for example, the range from 30° to less than 60° and that from 150° to less than 180°. That is, the feature distribution diffused map 1131 is obtained by adding the feature distribution diffused map 1121 weighted by 1.0, the feature distribution diffused maps 1122 and 1126 weighted by 0.6, and the feature distribution diffused maps 1123 and 1125 weighted by 0.2. In this way, the conversion required to relax the localities of features in the present invention is not limited to one required to relax the localities in position spaces; it may be a conversion required to relax the localities in other spaces.
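
The direction-space diffusion described above reduces to a circular weighted sum over the six maps. The following sketch implements the 1.0 / 0.6 / 0.2 weighting with wrap-around, so that, for example, the map for 0° to less than 30° also draws on the map for 150° to less than 180°.

```python
import numpy as np

def smooth_across_directions(maps: np.ndarray) -> np.ndarray:
    """Step S156: diffuse feature information in the gradient-direction space."""
    n = maps.shape[0]  # six direction ranges in this embodiment
    out = np.zeros_like(maps)
    for i in range(n):
        # Weights: 1.0 for the same range, 0.6 for neighbors, 0.2 one range further.
        for offset, weight in ((0, 1.0), (1, 0.6), (-1, 0.6), (2, 0.2), (-2, 0.2)):
            out[i] += weight * maps[(i + offset) % n]  # circular: 0 deg = 180 deg
    return out
```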

The feature distribution diffused maps 1131 to 1136 in FIG. 11, which are generated by the processing in step S156, are examples of the plurality of feature distribution diffused maps to be generated by the smoothing processing unit 92 in FIG. 9. In this manner, the smoothing processing unit 92 executes the parameter decision processing for the conversion in the first stage, and the conversions in the first and second stages, thereby generating the plurality of feature distribution diffused maps. The processing in this unit corresponds to steps S1020 and S1021 in FIG. 10. With this processing, as shown in the feature distribution diffused maps 1131 to 1136 in FIG. 11, the localities of features can be greatly relaxed as compared to the plurality of original feature distribution maps. In this manner, the same characteristic effects as those described in the first embodiment can be obtained.

The subsequent processing units, that is, a decision tree search processing unit 93, score addition unit 94, and pattern determination unit 95, are substantially the same as the decision tree search processing unit 53, score addition unit 54, and pattern determination unit 55 in the second embodiment, except that the processing target data in the decision tree search processing unit 93 are the plurality of feature distribution diffused maps in place of the distance map. Hence, a description of these units will not be given.

As described above, in the pattern identification method of this embodiment, features which exist on input data are extracted. Then, the aforementioned processing for diffusing pieces of feature information in the two stages of the position space and the luminance gradient direction space is executed, thereby generating a plurality of feature distribution diffused maps in which the localities of the extracted features are relaxed. Also, except for the use of the plurality of feature distribution diffused maps, as in the second embodiment, a plurality of decision trees which execute branch processing in their respective branch nodes using sampling data in the plurality of feature distribution diffused maps are used, and their results are integrated to implement the pattern identification processing. In this way, high-speed pattern identification which is robust against various variations can be attained.

Note that this embodiment extracts a feature at a point of interest based on the result of the corner detection processing. However, the present invention is not limited to this. For example, other feature detection methods, such as a method based on a blob detection result using a Difference of Gaussians (DoG), may be used. As in the other embodiments, as the pattern identification method using sampling data, a method using a comparison result between the value of one point and a predetermined threshold may be used in addition to the method using two-point comparison results. Also, the arrangement of this embodiment is applicable to a method of identifying whether or not input data matches a specific category, and to a method of detecting a pattern of a specific category from an input image by raster-scan processing.
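
For reference, a blob detector based on a Difference of Gaussians could look like the following sketch; the two sigmas and the response threshold are illustrative assumptions, not values prescribed here.

```python
import cv2
import numpy as np

def dog_keypoints(gray, sigma1: float = 1.0, sigma2: float = 1.6,
                  thresh: float = 0.02):
    """Blob-like feature positions from a Difference of Gaussians (DoG)."""
    g1 = cv2.GaussianBlur(np.float32(gray), (0, 0), sigma1)
    g2 = cv2.GaussianBlur(np.float32(gray), (0, 0), sigma2)
    dog = g1 - g2
    ys, xs = np.where(np.abs(dog) > thresh * np.abs(dog).max())
    return list(zip(xs.tolist(), ys.tolist()))
```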

According to the present invention, high-speed pattern identification which is robust against various variations can be attained using sampling data.

Other Embodiments

Aspects of the present invention can also be realized by a computer of a system or apparatus (or devices such as a CPU or MPU) that reads out and executes a program recorded on a memory device to perform the functions of the above-described embodiment(s), and by a method, the steps of which are performed by a computer of a system or apparatus by, for example, reading out and executing a program recorded on a memory device to perform the functions of the above-described embodiment(s). For this purpose, the program is provided to the computer for example via a network or from a recording medium of various types serving as the memory device (for example, a computer-readable storage medium).

While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

This application claims the benefit of Japanese Patent Application No. 2010-198262, filed on Sep. 3, 2010, which is hereby incorporated by reference herein in its entirety.

1.-13. (canceled)
14. An information processing apparatus comprising: a holding unit configured to hold, based on a pixel value obtained from a learning image that belongs to any of a plurality of categories, a classified result of a plurality of learning images and a position where the pixel value is to be obtained; an input unit configured to input an image; a first generation unit configured to generate a first image indicating a characteristic distribution of the input image; an obtaining unit configured to obtain a parameter with regard to a conversion required to relax localities of the first image, wherein the parameter is determined in advance; a second generation unit configured to generate a second image by applying the conversion having the determined parameter to the first image; a selecting unit configured to select a plurality of individual pixels from the second image based on the position where the pixel value is to be obtained; a comparing unit configured to compare a plurality of pixel values of the selected plurality of individual pixels; and a determination unit configured to determine, for the input image, a corresponding category of the learning image based on a result of the comparing unit.
15. The apparatus according to claim 14, wherein the input image and the learning image include an object, and wherein the category is information representing which direction the object is observed from.
16. The apparatus according to claim 14, wherein the first image represents a distribution of edges included in the input image.
17. The apparatus according to claim 16, further comprising an edge extraction unit configured to extract the edges, wherein the first generation unit generates a first image representing a distribution of the edges, based on a result of the edge extraction unit.
18. The apparatus according to claim 14, wherein the input image includes an object, and wherein the parameter is predetermined based on an image including the object.
19. The apparatus according to claim 14, wherein the input unit inputs a plurality of partial images extracted from one image, and wherein the obtaining unit, the selecting unit, the comparing unit, and the determination unit perform each processing on each of the input plurality of partial images, the apparatus further comprising: a second determination unit configured to determine a category of an image constituted from the partial images, based on each category determined for each of the input plurality of partial images.
20. An information processing apparatus comprising: a holding unit configured to hold, based on a pixel value obtained from a learning image that belongs to any of a plurality of categories, a classified result of a plurality of learning images and a position where the pixel value is to be obtained; an input unit configured to input an image; a first generation unit configured to generate a first image indicating a characteristic distribution from the input image; an obtaining unit configured to obtain a parameter with regard to a conversion required to relax localities of the first image, wherein the parameter is determined in advance; a second generation unit configured to generate a second image by applying the conversion having the determined parameter to the first image; a selecting unit configured to select an individual pixel from the second image based on the position where the pixel value is to be obtained; a comparing unit configured to compare a pixel value of the selected individual pixel with a threshold value; and a determination unit configured to determine, for the input image, a corresponding category of the learning image based on a result of the comparing unit.
 21. The apparatus according to claim 20, wherein the input image and the learning image include an object, and wherein the category is information representing which direction the object is observed from.
 22. The apparatus according to claim 20, wherein the first image represents a distribution of edges included in the input image.
23. The apparatus according to claim 22, further comprising an edge extraction unit configured to extract the edges, wherein the first generation unit generates a first image representing a distribution of the edges, based on a result of the edge extraction unit.
24. The apparatus according to claim 20, wherein the input image includes an object, and wherein the parameter is predetermined based on an image including the object.
 25. The apparatus according to claim 20, wherein the input unit inputs a plurality of partial images extracted from one image, and wherein the obtaining unit, the selecting unit, the comparing unit, and the determination unit perform each processing on each of the input plurality of partial images, the apparatus further comprising: a second determination unit configured to determine a category of an image constituted from the partial images, based on each category determined for each of the input plurality of partial images.
26. A method for controlling an information processing apparatus comprising: holding, based on a pixel value obtained from a learning image that belongs to any of a plurality of categories, a classified result of a plurality of learning images and a position where the pixel value is to be obtained; inputting an image; generating a first image indicating a characteristic distribution of the input image; obtaining a parameter with regard to a conversion required to relax localities of the first image, wherein the parameter is determined in advance; generating a second image by applying the conversion having the determined parameter to the first image; selecting a plurality of individual pixels from the second image based on the position where the pixel value is to be obtained; comparing a plurality of pixel values of the selected plurality of individual pixels; and determining, for the input image, a corresponding category of the learning image based on a result of the comparing.
27. A non-transitory computer readable storage medium storing a computer program which, when executed on a computer, causes the computer to execute the steps of a method for controlling an information processing apparatus comprising: holding, based on a pixel value obtained from a learning image that belongs to any of a plurality of categories, a classified result of a plurality of learning images and a position where the pixel value is to be obtained; inputting an image; generating a first image indicating a characteristic distribution of the input image; obtaining a parameter with regard to a conversion required to relax localities of the first image, wherein the parameter is determined in advance; generating a second image by applying the conversion having the determined parameter to the first image; selecting a plurality of individual pixels from the second image based on the position where the pixel value is to be obtained; comparing a plurality of pixel values of the selected plurality of individual pixels; and determining, for the input image, a corresponding category of the learning image based on a result of the comparing.
28. A method for controlling an information processing apparatus comprising: holding, based on a pixel value obtained from a learning image that belongs to any of a plurality of categories, a classified result of a plurality of learning images and a position where the pixel value is to be obtained; inputting an image; generating a first image indicating a characteristic distribution from the input image; obtaining a parameter with regard to a conversion required to relax localities of the first image, wherein the parameter is determined in advance; generating a second image by applying the conversion having the determined parameter to the first image; selecting an individual pixel from the second image based on the position where the pixel value is to be obtained; comparing a pixel value of the selected individual pixel with a threshold value; and determining, for the input image, a corresponding category of the learning image based on a result of the comparing.
29. A non-transitory computer readable storage medium storing a computer program which, when executed on a computer, causes the computer to execute the steps of a method for controlling an information processing apparatus comprising: holding, based on a pixel value obtained from a learning image that belongs to any of a plurality of categories, a classified result of a plurality of learning images and a position where the pixel value is to be obtained; inputting an image; generating a first image indicating a characteristic distribution from the input image; obtaining a parameter with regard to a conversion required to relax localities of the first image, wherein the parameter is determined in advance; generating a second image by applying the conversion having the determined parameter to the first image; selecting an individual pixel from the second image based on the position where the pixel value is to be obtained; comparing a pixel value of the selected individual pixel with a threshold value; and determining, for the input image, a corresponding category of the learning image based on a result of the comparing.